Improving Object Detection Quality by Incorporating Global Contexts via Self-Attention

Abstract

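Fully convolutional structures provide feature maps acquiring local contexts of an image by only stacking numerous convolutional layers. These structures are known to be effective in modern state-of-the-art object detectors such as Faster R-CNN and SSD, which find objects from local contexts. However, the quality of object detectors can be further improved by incorporating global contexts when some ambiguous objects should be identified by surrounding objects or background. In this paper, we introduce a self-attention module for object detectors to incorporate global contexts. More specifically, our self-attention module allows the feature extractor to compute feature maps with global contexts through the self-attention mechanism. Our self-attention module computes relationships among all elements in the feature maps, and then blends the feature maps considering the computed relationships. Therefore, this module can capture long-range relationships among objects or backgrounds, which is difficult for fully convolutional structures. Furthermore, our proposed module is not limited to any specific object detectors, and it can be applied to any CNN-based model for any computer vision task. In the experimental results on the object detection task, our method shows remarkable gains in average precision (AP) compared to popular models that have fully convolutional structures. In particular, compared to Faster R-CNN with the ResNet-50 backbone, our module applied to the same backbone achieved +4.0 AP gains without bells and whistles. In semantic segmentation and panoptic segmentation tasks, our module also improved the performance metrics used for each task.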
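
The abstract describes the module only at a high level: compute relationships among all elements of the feature maps, then blend the feature maps according to those relationships. A minimal NumPy sketch of that idea follows. It assumes scaled dot-product attention over flattened spatial positions with a residual connection; the function name `self_attention_2d`, the projection matrices `w_q`, `w_k`, `w_v`, and all shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_2d(feat, w_q, w_k, w_v):
    """Blend a feature map with global context (illustrative sketch).

    feat: (H, W, C) feature map from a convolutional backbone.
    w_q, w_k: (C, D) hypothetical query/key projections.
    w_v: (C, C) hypothetical value projection.
    Returns the blended (H, W, C) map and the (H*W, H*W) attention matrix.
    """
    h, w, c = feat.shape
    x = feat.reshape(h * w, c)           # one row per spatial position

    q = x @ w_q                          # queries, (H*W, D)
    k = x @ w_k                          # keys,    (H*W, D)
    v = x @ w_v                          # values,  (H*W, C)

    # Relationships among all pairs of positions, regardless of distance.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)      # each row sums to 1

    # Blend the features according to the computed relationships, and
    # keep the original local features through a residual connection
    # (the residual is an assumption of this sketch).
    out = x + attn @ v
    return out.reshape(h, w, c), attn


# Toy example with random weights (assumed sizes).
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 5, 8))
w_q = rng.standard_normal((8, 4)) * 0.1
w_k = rng.standard_normal((8, 4)) * 0.1
w_v = rng.standard_normal((8, 8)) * 0.1
out, attn = self_attention_2d(feat, w_q, w_k, w_v)
```

Because the output has the same shape as the input, a block of this kind could in principle be placed after any convolutional stage, which is consistent with the abstract's claim that the module is not tied to a specific detector. The attention matrix has (H·W)² entries, so this naive form is only practical on low-resolution feature maps.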

Similar resources

Attention-based Object Detection

Improving predictive models of glaucoma severity by incorporating quality indicators

OBJECTIVE In this paper we present an evaluation of the role of reliability indicators in glaucoma severity prediction. In particular, we investigate whether it is possible to extract useful information from tests that would be normally discarded because they are considered unreliable. METHODS We set up a predictive modelling framework to predict glaucoma severity from visual field (VF) tests...

Object detection by global contour shape

We present a method for object class detection in images based on global shape. A distance measure for elastic shape matching is derived, which is invariant to scale and rotation, and robust against non-parametric deformations. Starting from an over-segmentation of the image, the space of potential object boundaries is explored to find boundaries, which have high similarity with the shape templ...

Reducing Medical Costs and Improving Quality via Self-Management Tools

April 2007 | Volume 4 | Issue 4 | e104 In 2004, health-care providers in the United States consumed $US1.9 trillion, or about 16% of the gross domestic product. By 2010, the cost of health care is predicted to exceed 20% of the US gross domestic product [1]. The management of chronic diseases currently accounts for 70%–75% of health-care spending [2], and this proportion is likely to increase i...

Incorporating Global Visual Features into Attention-based Neural Machine Translation

We introduce multi-modal, attention-based Neural Machine Translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder. Global image features are extracted using a pre-trained convolutional neural network and are incorporated (i) as words in the source sentence, (ii) to initialise the encoder hidden state, and (iii) as additional data to init...

Journal

Journal title: Electronics

Year: 2021

ISSN: 2079-9292

DOI: https://doi.org/10.3390/electronics10010090